Semantic Overlay Networks for Peer-to-peer Web Search

Authors

  • Tim Benke
  • Gerhard Weikum
  • Josiane Xavier Parreira
  • Sebastian Michel
  • Holger Bast
Abstract

We consider a network of peers in which each peer holds its own collection, obtained by independently crawling the web. A central task in designing a distributed search system for such a network is efficient query routing, i.e., finding the most promising peers to answer a query. The efficiency of routing techniques, however, depends heavily on the underlying network organization. Previous work has therefore proposed semantic overlay networks (SONs), in which peers are grouped according to their contents. In this work we present a rather different notion of SONs, in which each peer decides freely which peers it does and does not want to connect to, following the idea of the recently proposed p2pDating algorithm. We consider a P2P network that uses Pastry as the underlying network infrastructure, and in which peers use Nutch to perform web crawls and Lucene to build local indexes. We show that SONs can greatly reduce the amount of traffic needed to answer a query while still maintaining high recall.
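
The idea of peers choosing their own overlay neighbours and routing queries only to them can be illustrated with a small in-memory sketch. It is not the paper's implementation: it uses no Pastry, Nutch, or Lucene, and the names `Peer`, `choose_friends`, and `route_query`, as well as the use of Jaccard term overlap as the criterion for picking neighbours, are assumptions made purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical peer: a tiny term index plus self-chosen overlay links."""
    name: str
    docs: dict[str, set[str]]                 # doc id -> terms (stand-in for a local index)
    friends: list[Peer] = field(default_factory=list)

    def terms(self) -> set[str]:
        # All terms that occur anywhere in this peer's collection.
        return set().union(*self.docs.values())

    def search(self, query: set[str]) -> set[str]:
        # Conjunctive match: a document qualifies if it contains every query term.
        return {doc_id for doc_id, t in self.docs.items() if query <= t}


def jaccard(a: set[str], b: set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def choose_friends(peer: Peer, candidates: list[Peer], k: int) -> None:
    # Assumed criterion: keep the k candidates with the highest term overlap.
    others = [c for c in candidates if c is not peer]
    ranked = sorted(others, key=lambda c: jaccard(peer.terms(), c.terms()), reverse=True)
    peer.friends = ranked[:k]


def route_query(origin: Peer, query: set[str]) -> tuple[set[str], int]:
    # Answer locally, then forward only to overlay neighbours; count messages sent.
    results = origin.search(query)
    messages = 0
    for friend in origin.friends:
        messages += 1
        results |= friend.search(query)
    return results, messages
```

In this toy setting, contacting only the chosen neighbours costs fewer messages than contacting every other peer, and recall stays high as long as the relevant documents sit on peers with similar content. That is the trade-off the abstract describes, shown here without any of its measurements.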

Similar resources

SemreX: Efficient search in a semantic overlay for literature retrieval

The World Wide Web is growing at such a pace that even the biggest centralized search engines are able to index only a small part of the available documents on the Internet. The decentralized structure, together with the features of self-organization and fault-tolerance, makes peer-to-peer networking an effective information-sharing model; however, content searching still remains a serious chall...

Distributed Suffix Tree for Peer-to-Peer Search

Establishing an appropriate semantic overlay on Peer-to-Peer networks to obtain both semantic ability and scalability is a challenge. Current DHT-based P2P networks are limited in their ability to support semantic search. This paper proposes the DST (Distributed Suffix Tree) overlay as the intermediate layer between the DHT overlay and the semantic overlay. The DST overlay supports search of ke...

p2pDating: Real life inspired semantic overlay networks for Web search

We consider a network of autonomous peers forming a logically global but physically distributed search engine, where every peer has its own local collection generated by independently crawling the web. A challenging task in such systems is to efficiently route user queries to peers that can deliver high quality results and be able to rank these returned results, thus satisfying the users’ infor...

GridVine: Building Internet-Scale Semantic Overlay Networks

This paper addresses the problem of building scalable semantic overlay networks. Our approach follows the principle of data independence by separating a logical layer, the semantic overlay for managing and mapping data and metadata schemas, from a physical layer consisting of a structured peer-to-peer overlay network for efficient routing of messages. The physical layer is used to implement var...

Semantic Overlays for P2P Web Searching

Peer-to-peer (P2P) web search has gained a lot of interest lately, due to the salient characteristics of P2P systems, namely scalability, fault-tolerance and load-balancing. However, the lack of global knowledge in a vast and dynamically evolving environment like the Web presents a grand challenge for organizing content and providing efficient searching mechanisms. Semantic overlay networks (SO...

Publication date: 2007